Personalized Machine Translation: Preserving Original Author Traits

Authors

  • Shuly Wintner
  • Shachar Mirkin
  • Lucia Specia
  • Ella Rabinovich
  • Raj Nath Patel

Abstract

The language that we produce reflects our personality, and various personal and demographic characteristics can be detected in natural language texts. We focus on one particular personal trait of the author, gender, and study how it is manifested in original texts and in translations. We show that the author’s gender has a powerful, clear signal in original texts, but this signal is obfuscated in human and machine translation. We then propose simple domain adaptation techniques that help retain the original gender traits in the translation, without harming the quality of the translation, thereby creating more personalized machine translation systems.

Similar resources

Motivating Personality-aware Machine Translation

Language use is known to be influenced by personality traits as well as by sociodemographic characteristics such as age or mother tongue. As a result, it is possible to automatically identify these traits of the author from her texts. It has recently been shown that knowledge of such dimensions can improve performance in NLP tasks such as topic and sentiment modeling. We posit that machine tran...

Personalized Machine Translation: Predicting Translational Preferences

Machine Translation (MT) has advanced in recent years to produce better translations for clients’ specific domains, and sophisticated tools allow professional translators to obtain translations according to their prior edits. We suggest that MT should be further personalized to the end-user level – the receiver or the author of the text – as done in other applications. As a step in that directi...

Improved Domain Adaptation for Statistical Machine Translation

We present a simple and effective infrastructure for domain adaptation for statistical machine translation (MT). To build MT systems for different domains, it trains, tunes and deploys a single translation system that is capable of producing adapted domain translations and preserving the original generic accuracy at the same time. The approach unifies automatic domain detection and domain model...

Preserving Discourse Structure when Simplifying Text

Text simplification involves restructuring sentences by replacing particular syntactic constructs (like embedded clauses and appositives). The aim is to make the text easier to read for some target group (like aphasics and people with low reading ages) or easier to process by some program (like a parser or machine translation system). However, sentence-level syntactic restructuring can wreak hav...

CORRIGENDUM: An Exhaustive Epistatic SNP Association Analysis on Expanded Wellcome Trust Data

The authors have noticed that the original version of this Article contained a typographical error in the spelling of the author Hoifung Poon which was incorrectly given as Hoifung Poong. Furthermore, the author Jeff Baxter was incorrectly listed as Scott Baxter. These changes have now been corrected in both the PDF and HTML versions of the Article. SUBJECT AREAS: STATISTICAL METHODS MACHINE LE...

Publication date: 2017